skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Nguyen, Thy"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. In this work, we study the problem of privately maximizing a submodular function in the streaming setting. Extensive work has been done on privately maximizing submodular functions in the general case when the function depends upon the private data of individuals. However, when the size of the data stream drawn from the domain of the objective function is large or arrives very fast, one must privately optimize the objective within the constraints of the streaming setting. We establish fundamental differentially private baselines for this problem and then derive better trade-offs between privacy and utility for the special case of decomposable submodular functions. A submodular function is decomposable when it can be written as a sum of submodular functions; this structure arises naturally when each summand function models the utility of an individual and the goal is to study the total utility of the whole population as in the well-known Combinatorial Public Projects Problem. Finally, we complement our theoretical analysis with experimental corroboration. 
    more » « less
  2. Recently a multi-agent variant of the classical multi-armed bandit was proposed to tackle fairness issues in online learning. Inspired by a long line of work in social choice and economics, the goal is to optimize the Nash social welfare instead of the total utility. Unfortunately previous algorithms either are not efficient or achieve sub-optimal regret in terms of the number of rounds. We propose a new efficient algorithm with lower regret than even previous inefficient ones. We also complement our efficient algorithm with an inefficient approach with regret that matches the lower bound for one agent. The experimental findings confirm the effectiveness of our efficient algorithm compared to the previous approaches. 
    more » « less
  3. We consider the problem of clustering in the learning-augmented setting. We are given a data set in $$d$$-dimensional Euclidean space, and a label for each data point given by a predictor indicating what subsets of points should be clustered together. This setting captures situations where we have access to some auxiliary information about the data set relevant for our clustering objective, for instance the labels output by a neural network. Following prior work, we assume that there are at most an $$\alpha \in (0,c)$$ for some $c<1$ fraction of false positives and false negatives in each predicted cluster, in the absence of which the labels would attain the optimal clustering cost $$\mathrm{OPT}$$. For a dataset of size $$m$$, we propose a deterministic $$k$$-means algorithm that produces centers with an improved bound on the clustering cost compared to the previous randomized state-of-the-art algorithm while preserving the $$O( d m \log m)$$ runtime. Furthermore, our algorithm works even when the predictions are not very accurate, i.e., our cost bound holds for $$\alpha$$ up to $1/2$, an improvement from $$\alpha$$ being at most $1/7$ in previous work. For the $$k$$-medians problem we again improve upon prior work by achieving a biquadratic improvement in the dependence of the approximation factor on the accuracy parameter $$\alpha$$ to get a cost of $$(1+O(\alpha))\mathrm{OPT}$$, while requiring essentially just $$O(md \log^3 m/\alpha)$$ runtime. 
    more » « less
  4. This paper studies the problem of clustering in metric spaces while preserving the privacy of individual data. Specifically, we examine differentially private variants of the k-medians and Euclidean k-means problems. We present polynomial algorithms with constant multiplicative error and lower additive error than the previous state-of-the-art for each problem. Additionally, our algorithms use a clustering algorithm without differential privacy as a black-box. This allows practitioners to control the trade-off between runtime and approximation factor by choosing a suitable clustering algorithm to use. 
    more » « less